167 research outputs found

    Biased amino acid composition in warm-blooded animals

    Among eubacteria and archaebacteria, amino acid composition is correlated with habitat temperatures. In particular, species living at high temperatures have proteins enriched in the amino acids E-R-K and depleted in D-N-Q-T-S-H-A. Here, we show that this bias is a proteome-wide effect in prokaryotes, and that the same trend is observed in fully sequenced mammals and chicken compared to cold-blooded vertebrates (Reptilia, Amphibia and fish). Thus, warm-blooded vertebrates likely experienced genome-wide weak positive selection on amino acid composition to increase protein thermostability

    Transcriptional coupling of neighbouring genes and gene expression noise: evidence that gene orientation and non-coding transcripts are modulators of noise

    For some genes, notably essential genes, reliable expression when it is needed is vital; hence low noise in expression is favourable. For others, noise is necessary for coping with stochasticity or for providing dice-like mechanisms to control cell fate. But how is noise in gene expression modulated? We hypothesise that gene orientation may be crucial: for divergently organized gene pairs, expression of one gene could affect the chromatin of its neighbour, thereby reducing noise. Transcription of antisense non-coding RNA from a shared promoter is similarly argued to be a noise-reduction mechanism. Stochastic simulation models confirm the expectation. The model correctly predicts: that protein coding genes with bi-promoter architecture, including those with a ncRNA partner, have lower noise than other genes; divergent gene pairs uniquely have correlated expression noise; distance between promoters predicts noise; ncRNA divergent transcripts are associated with genes that a priori would be under selection for low noise; essential genes reside in divergent orientation more than expected; bi-promoter pairs are rare subtelomerically, cluster together and are enriched in essential gene clusters. We conclude that gene orientation and transcription of ncRNAs, even if unstable, are candidate modulators of noise levels

    Evidence of widespread degradation of gene control regions in hominid genomes

    Although sequences containing regulatory elements located close to protein-coding genes are often only weakly conserved during evolution, comparisons of rodent genomes have implied that these sequences are subject to some selective constraints. Evolutionary conservation is particularly apparent upstream of coding sequences and in first introns, regions that are enriched for regulatory elements. By comparing the human and chimpanzee genomes, we show here that there is almost no evidence for conservation in these regions in hominids. Furthermore, we show that gene expression is diverging more rapidly in hominids than in murids per unit of neutral sequence divergence. By combining data on polymorphism levels in human noncoding DNA and the corresponding human–chimpanzee divergence, we show that the proportion of adaptive substitutions in these regions in hominids is very low. It therefore seems likely that the lack of conservation and increased rate of gene expression divergence are caused by a reduction in the effectiveness of natural selection against deleterious mutations because of the low effective population sizes of hominids. This has resulted in the accumulation of a large number of deleterious mutations in sequences containing gene control elements and hence a widespread degradation of the genome during the evolution of humans and chimpanzees

    Amino acid composition in endothermic vertebrates is biased in the same direction as in thermophilic prokaryotes

    BACKGROUND: Among bacteria and archaea, amino acid usage is correlated with habitat temperatures. In particular, protein surfaces in species thriving at higher temperatures appear to be enriched in amino acids that stabilize protein structure and depleted in amino acids that decrease thermostability. Does this observation reflect a causal relationship, or could the apparent trend be caused by phylogenetic relatedness among sampled organisms living at different temperatures? And do proteins from endothermic and ectothermic vertebrates show similar differences? RESULTS: We find that the observed correlations between the frequencies of individual amino acids and prokaryotic habitat temperature are strongly influenced by evolutionary relatedness between the species analysed; however, a proteome-wide bias towards increased thermostability remains after controlling for phylogeny. Do eukaryotes show similar effects of thermal adaptation? A small shift of amino acid usage in the expected direction is observed in endothermic ('warm-blooded') mammals and chicken compared to ectothermic ('cold-blooded') vertebrates with lower body temperatures; this shift is not simply explained by nucleotide usage biases. CONCLUSION: Protein homologs operating at different temperatures have different amino acid composition, both in prokaryotes and in vertebrates. Thus, during the transition from ectothermic to endothermic life styles, the ancestors of mammals and of birds may have experienced weak genome-wide positive selection to increase the thermostability of their proteins.

    The Effects of Network Neighbours on Protein Evolution

    Interacting proteins may often experience similar selection pressures. Thus, we may expect that neighbouring proteins in biological interaction networks evolve at similar rates. This has been previously shown for protein-protein interaction networks. Similarly, we find correlated rates of evolution of neighbours in networks based on co-expression, metabolism, and synthetic lethal genetic interactions. While the correlations are statistically significant, their magnitude is small, with network effects explaining only between 2% and 7% of the variation. The strongest known predictor of the rate of protein evolution remains expression level. We confirmed the previous observation that similar expression levels of neighbours indeed explain their similar evolution rates in protein-protein networks, and showed that the same is true for metabolic networks. In co-expression and synthetic lethal genetic interaction networks, however, neighbouring genes still show somewhat similar evolutionary rates even after simultaneously controlling for expression level, gene essentiality and gene length. Thus, similar expression levels and related functions (as inferred from co-expression and synthetic lethal interactions) seem to explain correlated evolutionary rates of network neighbours across all currently available types of biological networks

    Genome-wide acceleration of protein evolution in flies (Diptera)

    BACKGROUND: The rate of molecular evolution varies widely between proteins, both within and among lineages. To what extent is this variation influenced by genome-wide, lineage-specific effects? To answer this question, we assess the rate variation between insect lineages for a large number of orthologous genes. RESULTS: When compared to the beetle Tribolium castaneum, we find that the stem lineage of flies and mosquitoes (Diptera) has experienced on average a 3-fold increase in the rate of evolution. Pairwise gene comparisons between Drosophila and Tribolium show a high correlation between evolutionary rates of orthologous proteins. CONCLUSION: Gene specific divergence rates remain roughly constant over long evolutionary times, modulated by genome-wide, lineage-specific effects. Among the insects analysed so far, it appears that the Tribolium genes show the lowest rates of divergence. This has the practical consequence that homology searches for human genes yield significantly better matches in Tribolium than in Drosophila. We therefore suggest that Tribolium is better suited for comparisons between phyla than the widely employed dipterans

    Deep learning allows genome-scale prediction of Michaelis constants from structural features

    The Michaelis constant KM describes the affinity of an enzyme for a specific substrate and is a central parameter in studies of enzyme kinetics and cellular physiology. As measurements of KM are often difficult and time-consuming, experimental estimates exist for only a minority of enzyme–substrate combinations even in model organisms. Here, we build and train an organism-independent model that successfully predicts KM values for natural enzyme–substrate combinations using machine and deep learning methods. Predictions are based on a task-specific molecular fingerprint of the substrate, generated using a graph neural network, and on a deep numerical representation of the enzyme’s amino acid sequence. We provide genome-scale KM predictions for 47 model organisms, which can be used to approximately relate metabolite concentrations to cellular physiology and to aid in the parameterization of kinetic models of cellular metabolism

    A general model to predict small molecule substrates of enzymes based on machine and deep learning

    For most proteins annotated as enzymes, it is unknown which primary and/or secondary reactions they catalyze. Experimental characterizations of potential substrates are time-consuming and costly. Machine learning predictions could provide an efficient alternative, but are hampered by a lack of information regarding enzyme non-substrates, as available training data comprises mainly positive examples. Here, we present ESP, a general machine-learning model for the prediction of enzyme-substrate pairs with an accuracy of over 91% on independent and diverse test data. ESP can be applied successfully across widely different enzymes and a broad range of metabolites included in the training data, outperforming models designed for individual, well-studied enzyme families. ESP represents enzymes through a modified transformer model, and is trained on data augmented with randomly sampled small molecules assigned as non-substrates. By facilitating easy in silico testing of potential substrates, the ESP web server may support both basic and applied science